In [3]:
import os
import pandas as pd
In the first week of the social media scraping tool track, we used Netvizz.
It has six different modules.
We received a couple of files for each page. For example:
In [4]:
ethospagedatadir = "page_126517697403099_2017_10_31_14_48_28"
In [5]:
os.listdir(ethospagedatadir)
Out[5]:
We then explored the shape and nature of the data to see what we had received from Netvizz.
The .gdf graph file contains nodes
In [67]:
gdf_path = os.path.join(ethospagedatadir, 'page_126517697403099_2017_10_31_14_48_28.gdf')
# The node table is the header line plus 416 rows; the edge table starts after those 417 lines.
ethosg = {'nodes': pd.read_csv(gdf_path, nrows=416),
          'edges': pd.read_csv(gdf_path, skiprows=417)}
In [68]:
ethosg['nodes'].sample(3)
Out[68]:
and edges in the same file.
In [63]:
ethosg['edges'].sample(3)
Out[63]:
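The row counts used above (416 and 417) are specific to this one file. As a rough sketch, assuming the file follows the usual GDF layout in which the node table starts with a `nodedef>` line and the edge table with an `edgedef>` line, the split point could be found automatically. The helper name `read_gdf` and the UTF-8 encoding are assumptions made for illustration; they are not part of Netvizz or pandas.

```python
import io
import pandas as pd

def read_gdf(path, encoding="utf-8"):
    """Hypothetical helper: split a GDF file into node and edge DataFrames.

    Assumes the first line starts with 'nodedef>' and that a later line
    starts with 'edgedef>'; both tables are assumed to be non-empty.
    """
    with open(path, encoding=encoding) as fh:
        lines = fh.read().splitlines()

    # Index of the line that opens the edge table.
    edge_start = next(i for i, line in enumerate(lines)
                      if line.startswith("edgedef>"))

    def to_frame(block, prefix):
        # Header looks like 'nodedef>name VARCHAR,label VARCHAR';
        # keep only the column names and drop the type annotations.
        header = block[0][len(prefix):]
        columns = [col.strip().split(" ")[0] for col in header.split(",")]
        body = "\n".join(block[1:])
        return pd.read_csv(io.StringIO(body), names=columns, header=None)

    return {"nodes": to_frame(lines[:edge_start], "nodedef>"),
            "edges": to_frame(lines[edge_start:], "edgedef>")}
```

With this, `read_gdf(ethospagedatadir + '/' + 'page_126517697403099_2017_10_31_14_48_28.gdf')` should give the same two tables as the cell above, with cleaner column names, provided the file matches the assumed layout.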
The data in the fullstats data frame looks like this:
In [ ]:
ethosstats = pd.read_csv(ethospagedatadir + '/' + 'page_126517697403099_2017_10_31_14_48_28_fullstats.tab', sep='\t')
In [65]:
ethosstats.sample(3)
Out[65]:
The data in the comments data frame looks like this:
In [ ]:
ethoscomments = pd.read_csv(ethospagedatadir + '/' + 'page_126517697403099_2017_10_31_14_48_28_comments.tab', sep='\t')
In [66]:
ethoscomments.sample(3)
Out[66]:
Netvizz is described in Rieder's paper Studying Facebook via Data Extraction: The Netvizz Application (WebSci'13).
We can use Tableau, Gephi or other tools to merge and/or manipulate the data, and Table2Net to define other graphs.
The point was to bring these possibilities to our project groups and to raise some questions.
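Merging of this kind can also be tried directly in pandas before moving to another tool. The sketch below uses small toy tables in place of `ethosstats` and `ethoscomments`; the column names `post_id`, `likes` and `comment` are hypothetical and would need to be replaced with the columns actually present in the Netvizz files.

```python
import pandas as pd

# Toy stand-ins for the fullstats and comments tables.
# All column names here are hypothetical placeholders.
stats = pd.DataFrame({"post_id": ["p1", "p2"], "likes": [10, 3]})
comments = pd.DataFrame({"post_id": ["p1", "p1", "p2"],
                         "comment": ["a", "b", "c"]})

# Count comments per post, then attach that count to the per-post stats.
counts = comments.groupby("post_id").size().reset_index(name="n_comments")
merged = stats.merge(counts, on="post_id", how="left")
```

A left join keeps every post from the stats table, so posts without comments would show a missing value in `n_comments` instead of disappearing.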